Microbiome Boot Camp

Utah Valley University - BIOL490R

Course goals

  • Understand the data structure of microbiome studies
  • Demonstrate how to process microbiome data into a usable format
  • Explore processed microbiome data
  • Test hypotheses
  • Write a paper putting results into context of other research

Course requirements

  • Laptop with R and RStudio installed
  • Previous experience with R programming
    • data cleaning / transformations
    • plotting with ggplot
    • file management
    • model training and evaluation

Course setup

  • We have a real data set and some questions to address:
    • Pando foliar fungi
    • Are trees selecting their fungal endophyte or epiphyte communities?
    • Do foliar fungi follow stochastic assembly?
    • What spatial structure is there?
    • Hypotheses:

  • We will use this data set to learn how to process and analyze microbiome data
  • We will then write a paper as a class to present our findings for peer review
Image from Wiki commons


“Pando is believed to be the largest, most dense organism ever found at nearly 13 million pounds. The clone spreads over 106 acres, consisting of over 40,000 individual trees. The exact age of the clone and its root system is difficult to calculate, but it is estimated to have started at the end of the last ice age. Some of the trees are over 130 years old. It was first recognized by researchers in the 1970s and more recently proven by geneticists. Its massive size, weight, and prehistoric age have caused worldwide fame.”

— US Forest Service


We are using Pando as a natural laboratory to study the biogeography and assembly of fungi associated with plant leaves. These fungi, known as endophytes (inside the leaves) and epiphytes (on the surface of leaves), are important components of the plant microbiome. They modify plant disease severity, alter plant phenotype, and can even help plants resist common stressors such as drought. One important question is where plants get their foliar fungi. In grasses, they are passed along inside seeds (vertical transmission), but in dicot plants like Pando, they are assembled from the environment.

They blow in on the wind or are carried by animals or raindrops. Interestingly, not all fungi can form a healthy, stable relationship with all plants. Some environmental filtering is likely at work: abiotic environmental variables, or the plant itself, may be selecting which fungi make it into plant leaves.

Most studies of this nature have to contend with the fact that each plant sampled in the field has a different genotype, even when every sample comes from the same plant species. We don’t have that problem in this study, since every tree in Pando is part of the same clonal individual.

What this means is that we can sample leaves from Pando and can study the spatial structure of the fungal communities without worrying about plant genotype.

Geoff Zahn, Josh Leon, Austen Miller inside Pando clone


In September 2023, we collected leaf samples from all over the Pando clone and extracted DNA from both the leaf surfaces and interiors. Those samples were sequenced on an Illumina MiSeq system (2x300bp). It’s there that we start our work. We will use bioinformatics tools to turn the raw DNA reads into fungal community data that we can work with and explore. We will test hypotheses about environmental filtering of foliar fungi.
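To make "fungal community data" concrete, here is a minimal base R sketch of the kind of object we are aiming for: a sample-by-taxon count table paired with a sample metadata table. Every sample name, taxon label, and count below is invented for illustration; none of it comes from the Pando data set, and the real tables may be organized differently.

```r
# Hypothetical example: all names and counts are made up for illustration.

# Count table: one row per sample, one column per fungal taxon (e.g., an ASV)
counts <- matrix(
  c(120,  0, 30,
     80, 15,  0,
      0, 40, 60,
     10,  5, 90),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("S1_endo", "S1_epi", "S2_endo", "S2_epi"),
    c("ASV_1", "ASV_2", "ASV_3")
  )
)

# Sample metadata: one row per sample, in the same order as the count table
meta <- data.frame(
  sample_id = rownames(counts),
  tree      = c("T1", "T1", "T2", "T2"),
  fraction  = c("endophyte", "epiphyte", "endophyte", "epiphyte")
)

# Sanity check: the two tables must describe the same samples
stopifnot(identical(meta$sample_id, rownames(counts)))

# Sequencing depth per sample and number of taxa observed per sample
read_depth <- rowSums(counts)
richness   <- rowSums(counts > 0)

# Relative abundances: divide each row by its own total
rel_abund <- counts / read_depth
```

Keeping the counts and the metadata in separate tables that share sample IDs is a common convention; most of the processing work is about getting from raw reads to a count table like this one.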

Some questions we can ask:

  • Are foliar fungi reflective of their immediate environment? i.e., Do samples that are physically close to each other share more similar fungal communities than can be expected by chance?
  • Are there any “edge effects” in our communities? Are samples from the edge of the clone more similar to each other than those deeper inside the forest patch?
  • Epiphytes might ‘just be there’ but endophytes are living inside the plant tissue. So do we see contrasting patterns between the two categories of fungi? If so, this could be evidence for environmental filtering of endophytes. And if there’s no geographic structure in endophytes, it could suggest that the plants themselves are doing the filtering.
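The first question in the list above (do nearby samples share more similar communities than expected by chance?) can be framed as a Mantel-style permutation test. The sketch below uses base R and simulated data only: the coordinates and counts are randomly generated with no spatial signal built in, and the helper names `bray_curtis` and `mantel_perm` are made up for this example. It is one possible way to set up the test, not necessarily the method we will use in class.

```r
# Simulated data only: coordinates and counts are randomly generated.
set.seed(42)

n_samples <- 20
n_taxa    <- 15

# Hypothetical sample coordinates in meters on a local grid
coords <- cbind(x = runif(n_samples, 0, 500), y = runif(n_samples, 0, 500))

# Random counts with no built-in spatial signal
comm <- matrix(rpois(n_samples * n_taxa, lambda = 5), nrow = n_samples)

# Bray-Curtis dissimilarity between every pair of samples
bray_curtis <- function(m) {
  n <- nrow(m)
  d <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      d[i, j] <- sum(abs(m[i, ] - m[j, ])) / sum(m[i, ] + m[j, ])
    }
  }
  d
}

geo_dist  <- as.matrix(dist(coords))  # Euclidean distance between samples
comm_dist <- bray_curtis(comm)

# Mantel-style test: correlate the two distance matrices, then compare the
# observed correlation to correlations obtained after shuffling sample labels
mantel_perm <- function(d1, d2, n_perm = 999) {
  lower <- lower.tri(d1)
  r_obs <- cor(d1[lower], d2[lower])
  r_null <- replicate(n_perm, {
    idx <- sample(nrow(d2))
    cor(d1[lower], d2[idx, idx][lower])
  })
  # One-sided p-value: how often is a shuffled r at least as large as observed?
  p <- (sum(r_null >= r_obs) + 1) / (n_perm + 1)
  list(r = r_obs, p = p)
}

result <- mantel_perm(geo_dist, comm_dist)
```

Rows and columns are shuffled together so that each permutation keeps the community distances intact and only breaks their link to location. In practice, the vegan package offers `vegdist()` and `mantel()` for these steps; the hand-written versions here are only meant to show what happens under the hood.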

You will have lots of readings in this course.

We will use a shared Zotero library to keep track of all our papers.

Start by finding and reading:

Darcy, J. L., Swift, S. O. I., Cobian, G. M., Zahn, G. L., Perry, B. A., & Amend, A. S. (2020). Fungal communities living within leaves of native Hawaiian dicots are structured by landscape-scale variables as well as by host plants. Molecular Ecology, 29(16), 3102–3115. https://doi.org/10.1111/mec.15544


Here’s an overview of the different sites where we sampled

read in csv of locations

display metadata in scroll box

include leaflet map of sites
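The placeholder steps above (read in the CSV, look at the metadata, map the sites) might look roughly like the sketch below. The column names (`site`, `latitude`, `longitude`) are assumptions, and the two rows are placeholder points in the general vicinity of Pando, not our sampling sites. The sketch writes them to a temporary file so that it runs on its own. How the table is shown in a scroll box depends on the rendering tool, so that part is left out.

```r
# Hypothetical file layout: the course metadata file may use other column names.
# The two rows below are placeholder points, not real sampling sites.
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("site,latitude,longitude",
    "example_A,38.525,-111.750",
    "example_B,38.527,-111.748"),
  csv_path
)

# Read in the CSV of locations
sites <- read.csv(csv_path, stringsAsFactors = FALSE)

# Inspect the metadata
str(sites)

# Basic checks before mapping
stopifnot(all(c("site", "latitude", "longitude") %in% names(sites)))
stopifnot(all(abs(sites$latitude) <= 90), all(abs(sites$longitude) <= 180))

# Build an interactive map, but only if the leaflet package is installed
if (requireNamespace("leaflet", quietly = TRUE)) {
  site_map <- leaflet::leaflet(sites)
  site_map <- leaflet::addTiles(site_map)
  site_map <- leaflet::addCircleMarkers(
    site_map, lng = ~longitude, lat = ~latitude, popup = ~site
  )
  # Run print(site_map) to display the map in RStudio's Viewer
}
```

Checking column names and coordinate ranges before mapping catches the most common problem with site files, which is swapped or mislabeled latitude and longitude columns.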

Brief methods

Study sites



*Syringodium isoetifolium* (seagrass) meadow -- Image from Wiki commons


Sample processing

Sixteen seagrass samples were taken from each of 12 locations, and the V3 region of the 16S rDNA was amplified from each sample (N=192). These amplicons were sequenced on an Illumina MiSeq with v3 (2x300 bp) chemistry, along with 8 sample blanks (total study N=200).

Data analysis

Here’s where we come in. We’ve got raw 16S sequence data from this study and need to process it, explore and visualize it, and test hypotheses. Our final codebase will be deposited as part of the publication, along with any figures and statistical results we develop.



Logistics

We will use R (and some Bash) along with Git/GitHub to conduct all of our work.

Beyond the nitty gritty of coding, we will also be learning a lot about community ecology.

Some potential packages we will learn:

Here’s an example code archive for this type of work: Workshop Repository.

Here’s a BioProtocols paper walking through the workshop code: 16S Recipe (you’ll need to create a free account to download it).




Expectations and evaluation

Grades will be based on assignments and code contributions

Assignments

During the semester, several assignments will be given related to the course material. Examples include:

  • Looking up and reporting on alternative parameters for certain functions
  • Finding and presenting papers about relevant topics
  • Coding assignments such as novel figure generation
  • Annotated bibliographies on background and discussion topics
  • In-class participation in discussion and hypothesis generation

Code contributions

Each student is expected to contribute to our final codebase. Comment lines denoting code authorship will be included in the final paper.

Writing

Each student is expected to contribute to writing, background reading/research/references, and editing. Students with low participation will not earn authorship on our paper, but grades will not be based on writing.




Working topics (subject to revision):

  • What is meta-amplicon technology?
  • The Earth Microbiome Project
  • Basics of community ecology
    • Who is there?
    • What are they doing?
    • How do they interact with each other?
    • How does the environment shape community structure?
    • Community assembly
    • Distributional ecology
  • Analytical methods in community ecology
    • Normalization / rarefaction
    • Alpha, beta, gamma diversity
    • Mantel / MRM / PermANOVA / Ordination / Networks
    • Differential tests
  • Technological methods and limitations
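As a preview of the analytical methods listed above, here is a small base R sketch of observed richness, the Shannon index H' = -sum(p_i * ln(p_i)), and rarefaction to an even read depth. The counts are made up for illustration, and `rarefy_counts` is a helper written for this example rather than a function from any package.

```r
# Made-up counts for three samples and four taxa (illustration only)
comm <- rbind(
  even    = c(25, 25, 25, 25),
  skewed  = c(97,  1,  1,  1),
  shallow = c( 5,  5,  0,  0)
)

# Alpha diversity: observed richness and Shannon index per sample
richness <- rowSums(comm > 0)
shannon <- apply(comm, 1, function(x) {
  p <- x[x > 0] / sum(x)   # proportions of the taxa that are present
  -sum(p * log(p))
})

# Gamma diversity here: number of taxa observed across all samples combined
gamma_richness <- sum(colSums(comm) > 0)

# Rarefaction: randomly subsample every sample down to the same read depth
rarefy_counts <- function(x, depth) {
  reads <- rep(seq_along(x), times = x)  # one entry per read, labeled by taxon
  kept  <- sample(reads, size = depth)   # draw reads without replacement
  tabulate(kept, nbins = length(x))      # recount reads per taxon
}

set.seed(1)
depth    <- min(rowSums(comm))           # shallowest sample sets the depth
rarefied <- t(apply(comm, 1, rarefy_counts, depth = depth))
```

The `even` and `skewed` samples have the same richness but very different Shannon values, which is the point of using more than one alpha diversity measure. The vegan package provides `diversity()` and `rrarefy()` for the same calculations on real data.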




Weekly tasks and assignments

Assignment 1

  • Read a paper several times and compile questions

Assignment 2

  • Literature/resource search and annotation